Abstract: A new measurement method for the spatial distribution of neutron beam flux in boron neutron
capture therapy (BNCT) is being developed based on the two-dimensional Micromegas detector. To address
the long processing times of traditional offline position reconstruction methods, this paper proposes a
field programmable gate array (FPGA) based online position reconstruction method utilizing the micro time
projection chamber principle. This method combines several key techniques: a self-adaptive serial link
technique built upon the dynamic adjustment of the delay chain length, a fast sorting and coordinate-matching
technique based on the mapping between signal timestamps and random access memory (RAM) addresses, and
a precise start point-merging technique utilizing a circular combined RAM. The performance test of the
self-adaptive serial link shows that the bit error rate of the link is below 10⁻¹² at a confidence level of 99%,
ensuring reliable data transmission. The experiment utilizing the readout electronics and Micromegas detector
shows a spatial resolution of approximately 1.4 mm, surpassing the current method's resolution level of 5 mm.
The beam experiment confirms that the readout electronics system can obtain the flux spatial distribution of
neutron beams online, thus validating the feasibility of the position reconstruction method. The online position
reconstruction method avoids traditional algorithms such as bubble sorting and traversal searching, simplifies
the design of the logic firmware, and reduces the time complexity from O(n²) to O(n). This study contributes
to the advancement in measuring neutron beam flux for BNCT.
1. Introduction


Boron neutron capture therapy (BNCT) is an advanced radiation therapy technology that selectively
eliminates tumors at the cellular level [1-2]. With the advancements in accelerator-based neutron sources in
recent years, BNCT has overcome the nuclear safety risks associated with traditional reactor-based
neutron sources [3-5]. This progress is increasingly fostering prospects for its clinical application in hospitals.


The spatial distribution of neutron beam flux is a critical parameter in the physical and clinical dosimetry of 
BNCT, because it fundamentally influences the precision of radiation dosage. However, there is currently no 
internationally standardized measurement method for this parameter. Traditional methods, such as
point-by-point scanning with active detectors or the spatial arrangement of passive detectors, suffer from
drawbacks such as long measurement times, complex operations, and low accuracy [3-9].

The Micromegas (Micro Mesh Gaseous Structure) detector is a two-dimensional micropattern gaseous
detector [10]. It offers fast two-dimensional beam imaging via a single measurement [11]. In this study,
Micromegas detectors are adopted to measure the spatial distribution of neutron beam flux in BNCT. The
measurement system, consisting of the Micromegas detector and readout electronics, is depicted in Fig. 1
[12-13]. The Micromegas detector uses a ⁿᵃᵗLiF film as the converter, in which incident neutrons first undergo
the ⁶Li(n, α)³H reaction. The resultant charged particles undergo primary ionization and subsequent avalanche
amplification in the drift and amplification gaps, respectively. Finally, current signals are induced on the anode
readout strips. The Micromegas detector used is equipped with 192 readout strips in each of the X- and
Y-directions. The 384 readout strips are connected to the readout electronics system through several micro
coaxial flat cables. The readout electronics are designed as a modular structure in the PXI Express (PXIe)
platform, including 12 modular data acquisition boards (DABs) and 1 clock command board (CCB) [13]. The DABs
amplify, digitize, and shape the current signals from the 384 detector strips. The CCB is located in the system
timing slot. It distributes the system clock and the start and stop acquisition commands to the 12 DABs via the
star buses of the chassis backplane.
Fig. 1 Measurement system consisting of the Micromegas detector and readout electronics.

The rapid acquisition of measurement results is important for improving equipment practicality and 
measurement efficiency. Although the Micromegas detector is utilized to shorten the data acquisition process, 
the data analysis process can still be time-consuming. This is because conventional readout electronics systems 
typically only handle data acquisition, with the collected data being analyzed offline through software [14-18]. 
Several minutes are normally needed to analyze the large volume of data, which does not meet user 
requirements. The major time-consuming part of offline software data processing is particle position 
reconstruction. To address the issue, this study utilizes the parallel computing characteristic and real-time 
capabilities of a field programmable gate array (FPGA) to perform online particle position reconstruction. 

Two particle position reconstruction principles are commonly employed when the Micromegas detector is
used for particle position measurement: the charge centroid principle and the micro time projection chamber
(μTPC) principle [19-21]. The charge centroid principle reconstructs the hit position by calculating the average
of the strip positions weighted by the strip charge. The μTPC principle treats the Micromegas detector as a
micro time projection chamber. It combines a group of signals caused by a single incident particle and merges
them into the signal with the latest arrival time within the group. The readout strip corresponding to the signal
with the latest arrival time is considered as the entry point of the particle [19, 20]. This process is called start
point mergence in this study.

Compared with the charge centroid principle, the μTPC principle has a simpler calculation process if it is
implemented in an FPGA. Furthermore, multiple studies have shown that the charge centroid principle is
suitable for cases in which charged particles are emitted approximately vertically, and its accuracy of
position reconstruction rapidly deteriorates as the emission angle increases. By contrast, the μTPC principle
exhibits excellent position reconstruction accuracy across the entire angular range [19, 20]. Secondary charged
particles resulting from ⁶Li(n, α)³H neutron interactions are emitted randomly over the full 4π solid angle.
Based on these considerations, the μTPC principle is chosen for neutron position reconstruction in this study.

According to the μTPC principle, the particle position reconstruction involves operations such as data
aggregation from multiple DABs, signal time sorting, start point merging, and XY coordinate matching. Due 
to the high neutron flux, data acquisition only takes a few seconds. After data acquisition is completed, the 
particle position reconstruction process is initiated, and the reconstructed data are uploaded from the CCB to 
the chassis controller. 

Computational software for general computers often includes well-developed sorting and searching 
algorithms. However, these algorithms are not suitable for direct porting to FPGAs. On one hand, an FPGA has 
limited on-chip memory resources compared with general computers, and its underlying architecture differs 
from that of a central processing unit. Therefore, the FPGA is not proficient in sorting and searching operations 
involving large amounts of data. Implementing these software algorithms directly on an FPGA would lead to 
complex firmware design. On the other hand, the algorithms provided by conventional software often have 
high time complexity. For instance, typical algorithms such as bubble sorting and traversal search have a time
complexity of O(n²) [22]. If these software algorithms are directly ported to an FPGA, they may not fully
leverage the hardware acceleration advantages of the FPGA. 

To sum up, the demand for rapidly obtaining measurement results poses a challenge to the readout
electronics system in terms of FPGA-based online particle position reconstruction. This paper exploits
the characteristics of the FPGA and proposes a hardware-friendly online position reconstruction method.
2. Online hardware position reconstruction method 


2.1 Data aggregation built on the self-adaptive serial link technique 


The primary step in aggregating data is to establish the transmission link between the DABs and the CCB. 
The CCB serves as the main site for online position reconstruction. The DABs and the CCB are connected via 
serial star buses. The clock phase relationship between the two ends may differ from one power-on to the next,
which poses a challenge for stable data transmission over the serial link. To address this issue, this study
proposes a self-adaptive serial link that dynamically adjusts the delay chain length. Fig. 2 (a) shows the
composition of the link.

The data aggregation link works in full-duplex mode. Read requests are transmitted as high-level voltage
pulses via a PXIe_DSTARB star bus, and the data are aggregated via a PXIe_DSTARC star bus [23]. The
output serializer/deserializer (OSERDES) and corresponding input serializer/deserializer (ISERDES) modules
in Fig. 2 (a) are used for data serialization and deserialization, respectively [24]. The differential input buffer
(IBUFDS) and differential output buffer (OBUFDS) modules convert between differential and single-ended
voltage levels. Input delay resource (IDELAY) chains are programmable and play a core role in the
self-adaptive serial link. They are located on the input path of the FPGA pins, and each includes 64 delay
elements [25]. By changing the length of the delay chains at the receiver end, the phase relationship between
the data and the clock can be adjusted, ensuring the proper sampling of the data by the clock.

Since this study adopts a strategy of collecting data first and then performing position reconstruction, there
is no special requirement for the link rate. This study opts for a clock with a frequency of 62.5 MHz, which is
half of the FPGA device clock. With 10 serial bits transferred per clock cycle, the corresponding serial
aggregation link rate is 625 Mbps.


Before the DABs can send parallel data to the CCB, an initialization process is required to establish the 
connection of the link. The initialization process involves the transmission of handshake signals through the 
PXI_STAR bus. Fig. 2 (b) shows the following steps for establishing the connection of the self-adaptive serial 
link. 


• Step 1: The DAB sends a K28.5 control code stream through the PXIe_DSTARC bus while the CCB
keeps the PXI_STAR bus at a low voltage level.

• Step 2: The CCB utilizes the ISERDES module to detect and locate the boundaries of the serial data
stream for checking the recovery of the K-code. If the CCB fails to recover the K28.5 code, the system
proceeds to Step 3; otherwise, it proceeds to Step 4.

• Step 3: The CCB dynamically adjusts the IDELAY chain length by adding a delay element to the
IDELAY chain. Then, the system returns to Step 1.

• Step 4: If a certain number of consecutive K28.5 control codes are received, the system proceeds to
Step 5; otherwise, it returns to Step 3.

• Step 5: The CCB sets the PXI_STAR to a high voltage level, and the DAB sends a sequence of
incrementing codes (0x00 to 0xFF). Then, the system proceeds to Step 6. The reason for sending
incrementing codes from 0x00 to 0xFF is that the receiving end may be in a metastable state in which
it can correctly receive the K28.5 control code but not some other data. The incrementing codes help
detect whether the receiving end is in a stable state.

• Step 6: If the incrementing code is correctly received by the CCB, the system proceeds to Step 7;
otherwise, it proceeds to Step 3.

• Step 7: The initialization of the link is complete. The link is now ready for data transmission. The
system will continue to transmit the K28.5 control code and keep the PXI_STAR signal high when
the link is idle.
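
The handshake in Steps 1 to 7 can be pictured as a sweep over the IDELAY taps. The following Python sketch is a software model only, not the firmware: the function names, the required count of consecutive K28.5 codes (K_CODE_REPEATS), and the toy channel model are assumptions introduced for illustration.

```python
from typing import Callable, Optional

NUM_TAPS = 64          # delay elements per IDELAY chain (from the text)
K_CODE_REPEATS = 16    # assumed number of consecutive K28.5 codes required


def train_link(recv_kcode: Callable[[int], bool],
               recv_ramp: Callable[[int], bool]) -> Optional[int]:
    """Return the first IDELAY tap that passes Steps 2-6, or None."""
    for tap in range(NUM_TAPS):               # Step 3: one more delay element
        # Steps 2 and 4: K28.5 must be recovered several times in a row.
        if not all(recv_kcode(tap) for _ in range(K_CODE_REPEATS)):
            continue
        # Steps 5 and 6: the 0x00..0xFF ramp must also arrive intact.
        if recv_ramp(tap):
            return tap                        # Step 7: link is ready
    return None


# Toy channel model (hypothetical): taps 20..40 recover K28.5, but only
# taps 24..36 sample far enough from the data edge to pass the ramp.
def toy_kcode(tap: int) -> bool:
    return 20 <= tap <= 40


def toy_ramp(tap: int) -> bool:
    return 24 <= tap <= 36


locked_tap = train_link(toy_kcode, toy_ramp)
```

In this toy model the sweep skips taps 20 to 23, which recover the K-code but fail the ramp, mirroring the marginal sampling case that Step 5 is designed to catch.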
Fig. 2 Data aggregation built on the self-adaptive serial link technique. (a) Composition of the data
aggregation link. (b) Flow chart of the initialization of the self-adaptive serial link. (c) Data aggregation 
process between the CCB and DABs. 

To accommodate the limited on-chip storage resources of the FPGA, this study adopts a time slicing
approach for data aggregation. The data aggregation process between the CCB and DABs is depicted in Fig. 2
(c). Once data acquisition is completed, the CCB sends a data read request. Next, each DAB reads the waveform
data from the onboard DDR (double data rate) memory, where the waveform data of over-threshold valid
signals are preserved in time order. The waveform data are then further processed to extract the signal's
amplitude, timestamp, and channel number, which are then stored in a first-in, first-out (FIFO) buffer. The
DAB reads the data corresponding to the next time slice from the FIFO and sends them to the CCB.
Subsequently, the CCB stores the data from all 12 channels into a width conversion FIFO. This width
conversion step ensures that the port transmission capacity of the 12:1 switch is consistent, preventing data
congestion during the aggregation process. Finally, the CCB polls the width conversion FIFO to retrieve the
data and writes the signal information into two separate FIFOs: one for the X-direction and one for the
Y-direction. After completing the data aggregation of a time slice, the CCB proceeds with subsequent data
processing operations. Once finished, it sends a data read request for the next time slice in preparation for the
next round of processing.


2.2 Fast sorting and coordinate matching technique based on timestamp to RAM address mapping


Due to independent acquisition among DABs, the data aggregated into the X- and Y-direction FIFOs are 
time-disordered. Therefore, this study proposes a fast sorting technique based on the mapping between signal 
timestamps and random access memory (RAM) addresses, as described in Fig. 3 (a). The timestamp of a signal 
is mapped to the address of an FPGA on-chip RAM, and the channel number and amplitude of the signal are 
the contents of the RAM storage. Fig. 3 (a) provides an intuitive understanding, where time-ordered signals on 
the time axis are rotated by 90 degrees, resulting in a structure that closely resembles that of the RAM. 

The depth of the RAM that stores the data of a single time slice should be equal to the length of the time
slice. The timestamp to RAM address mapping uses the lower N bits of the timestamp as the address of the
RAM, where N and the depth of the RAM (Depth) satisfy 2^N = Depth. This allows the N-bit
timestamp to be directly mapped to the entire circular RAM address. The sorting module writes the data to the
first-level RAM based on the mapped address. A RAM address unit corresponds to a timestamp unit of 10 ns,
which is the sampling interval of the ADC. Considering FPGA resource utilization, the RAM depth is set to
2048. Hence, the time slice length is 20.48 μs. The probability of two signal timestamps being the same is low,
so no other operations are needed to handle this situation.
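
The address mapping can be illustrated with a short software model. This is a minimal sketch, assuming each signal is a (timestamp, channel, amplitude) tuple and that a Python list stands in for the on-chip RAM; the function name and sample values are hypothetical.

```python
DEPTH = 2048                 # RAM depth used in the text
ADDR_MASK = DEPTH - 1        # keeps the lower N = 11 bits (2**11 == 2048)


def sort_time_slice(signals):
    """signals: iterable of (timestamp, channel, amplitude) from one slice.

    Returns the RAM image as a list; None marks an empty address.
    """
    ram = [None] * DEPTH
    for timestamp, channel, amplitude in signals:
        ram[timestamp & ADDR_MASK] = (channel, amplitude)  # one write each
    return ram


# Toy input in arbitrary arrival order (timestamps in 10 ns units).
unsorted = [(4105, 7, 310), (4098, 6, 120), (4100, 8, 95)]
ram = sort_time_slice(unsorted)
in_time_order = [(addr, ram[addr]) for addr in range(DEPTH)
                 if ram[addr] is not None]
```

Each signal costs exactly one write, so the work grows linearly with the number of signals; reading the addresses in ascending order then yields the signals in time order without any comparisons.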

After sorting and merging processes in the X- and Y-directions, a further coordinate matching operation is 
required. Coordinate matching follows two principles: time alignment and nearest matching. Time alignment 
means that the time difference between the timestamps of the X-direction and Y-direction signals should be 
smaller than the predefined time window. Nearest matching means that if two or more signals are found within 
the time window, the X and Y direction signals with the closest timestamps are matched. 

A typical coordinate matching process proceeds as follows. For each X-direction signal, search for the Y- 
direction signal that has the smallest time difference. If the time difference between this pair of X and Y signals 
falls within the time window, generate a particle coordinate. Otherwise, the X-direction signal is considered as 
an isolated signal without a matching Y-direction signal. Then, continue processing the next X-direction signal 
until all X signals have been traversed. For each X-direction signal, this coordinate matching process requires 
a search through the Y-direction signals, resulting in a loop traversal operation. Therefore, the overall time 
complexity of this method is O(n²).

This study proposes a fast coordinate matching technique based on the previous timestamp to RAM
address mapping, as shown in Fig. 3 (b). The merged signals are stored in the second-level RAM, with addresses
still following the timestamp to RAM address mapping relationship. The X-direction and Y-direction signals
share the same RAM address reference, which means they are aligned on the time axis. Therefore, when
searching for a matching Y-direction signal for each X-direction signal, it is only necessary to scan a range
of RAM addresses as wide as the time window in the Y-direction RAM and find the nearest non-empty address
within the time window. This address corresponds to the matched Y-direction signal. The coordinate matching
for all signals can be completed by traversing through the second-level RAM of the X-direction. The
reconstructed event information is then stored in a FIFO buffer and transmitted to the chassis controller via the
peripheral component interconnect express (PCIe) bus.
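
A software analogue of this window search is sketched below. It is illustrative only: the RAMs are modeled as Python lists holding just a channel number, the tie-breaking rule (the earlier address wins) and the omission of circular wrap-around are simplifying assumptions, and the function names are hypothetical.

```python
def nearest_nonempty(ram, addr, window):
    """Closest non-empty address within +/- window of addr, or None."""
    for dist in range(window + 1):
        for cand in (addr - dist, addr + dist):   # earlier address wins a tie
            if 0 <= cand < len(ram) and ram[cand] is not None:
                return cand
    return None


def match_xy(x_ram, y_ram, window):
    """Single pass over the X RAM; returns (x_channel, y_channel, x_addr)."""
    events = []
    for addr, x_channel in enumerate(x_ram):
        if x_channel is None:
            continue
        y_addr = nearest_nonempty(y_ram, addr, window)
        if y_addr is not None:                    # otherwise X is isolated
            events.append((x_channel, y_ram[y_addr], addr))
    return events


x_ram = [None] * 16
y_ram = [None] * 16
x_ram[3], x_ram[10] = 10, 50      # X-strip channel numbers (toy values)
y_ram[4], y_ram[15] = 12, 60      # Y-strip channel numbers (toy values)
events = match_xy(x_ram, y_ram, window=2)
```

Because the window width is a fixed constant, each X-direction signal triggers a bounded number of reads, so the total cost scales with the RAM traversal rather than with the product of the X and Y signal counts.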

The fast sorting and coordinate matching technique based on the timestamp to RAM address mapping
offers the advantages of low time complexity and simple implementation. On one hand, this technique avoids
complex conventional algorithms such as bubble sort and traversal search. Thus, the time complexity is reduced
from O(n²) to O(n). On the other hand, this technique streamlines sorting and searching into basic RAM
read-write operations and is relatively simple to implement on an FPGA.
Fig. 3 Fast sorting and coordinate matching technique based on timestamp to RAM address mapping. (a)
Sorting process. (b) Matching process. 


2.3 Precise start point merging technique utilizing a circular combined RAM 


According to the μTPC principle, after the sorting operation, it is necessary to merge the groups of signals
that have close channel numbers and timestamps into the starting signal of each group. When merging the
signals at the end of each time slice, their starting signals may exist at the beginning of the next time slice.

This study designs a circular combined RAM structure to effectively solve this problem, as shown in Fig. 
4 (a). When merging the signals at the tail of one RAM slice, the signals at the head of the other RAM slice can 
be read to determine whether mergence is needed. The merged signals are then written into the second-level 
RAM for subsequent coordinate matching operations. 

RAM is read and written by both the upstream and downstream modules. It is crucial to design a well- 
coordinated timing relationship to prevent conflicts between RAM read and write operations. To address this 
issue, dual-port on-chip RAMs are utilized. The read and write operations of the RAM are coordinated 
depending on the status signals from the upstream and downstream modules. The explanation below focuses 
on the first-level RAM involved in the sorting and merging modules, but the same principles also apply to the 
second-level RAM used in the merging and matching modules. 

The first-level circular RAM consists of sections A and B, and each section corresponds to a single time 
slice. At the initial state when the RAM is empty, the sorting module writes data into sections A and B. It then 
sends a sorting completion signal to the merging module. After the initial state, every time the sorting module 
finishes writing a time slice of data into one section of the RAM, it sends a sorting completion signal to the 
merging module. Upon receiving the sorting completion signal, the merging module performs the merging 
operation on the other section of the RAM and sends a merging completion signal to the sorting module. It is 
important to note that the section of the RAM used for merging in each iteration is different from the section 
where the sorting module has just written. The two sections of RAM are used in a ping-pong-like manner, as 
depicted in Fig. 4 (a). This process is repeated to complete the merging of signals for all time slices. Furthermore, 
taking advantage of the parallel computing capabilities of FPGAs, the signal merging operations can be 
performed simultaneously in both the X-direction and Y-direction. Fig. 4 illustrates only one direction of the 
merging operation, but the same operations are carried out in the other direction, as well. 

The number of readout strips hit by charged particles varies, depending on the particle’s emission angle. 
Therefore, it is uncertain how many other signals need to be read when merging each signal. To address this 
issue, this study adopts a successive merging approach based on the circular combined RAM, as illustrated in 
Fig. 4 (b). During the mergence of a signal, a time window is opened, starting from this signal and extending 
toward the later time direction. Within the time window, only the first signal that closely matches the channel 
number of the current signal is sought. If one such signal is found, there is no need to check for other signals 
within the time window. The current signal is merged into the later signal, updating its amplitude as the sum of 
the two and incrementing the merged channel count by one. The amplitude and the merged channel count 
information could provide support for further event selection. If no candidate is found to be merged within the 
time window, then the current signal is the starting signal, and the signal is written to the same address in the 
second-level RAM. 
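
The successive merging pass can be modeled in software as follows. This is a sketch under stated assumptions: the combined RAM is flattened into one time-ordered list containing the current section followed by the next one, each entry is a hypothetical (channel, amplitude, merged_count) tuple, "closely matches" is taken to mean a channel difference of at most max_channel_gap, and the merged counts are carried along the chain by addition.

```python
def merge_section(ram, section_len, window, max_channel_gap=1):
    """Merge the first section of `ram`, looking ahead into the next one.

    ram: list of None or (channel, amplitude, merged_count), time-ordered,
         holding the current section followed by the next section.
    Returns the second-level entries for the current section; `ram` is
    updated in place so signals carried across the boundary are kept.
    """
    second_level = [None] * section_len
    for addr in range(section_len):
        cur = ram[addr]
        if cur is None:
            continue
        for later_addr in range(addr + 1, min(addr + window + 1, len(ram))):
            later = ram[later_addr]
            if later is not None and abs(later[0] - cur[0]) <= max_channel_gap:
                # Fold the current signal into the first matching later one.
                ram[later_addr] = (later[0], later[1] + cur[1],
                                   later[2] + cur[2])
                break
        else:
            second_level[addr] = cur     # nothing to merge into: start point
    return second_level


ram = [None] * 16                        # two sections of 8 addresses each
ram[2] = (90, 50, 1)                     # isolated signal
ram[5] = (20, 100, 1)                    # track crossing the boundary ...
ram[6] = (21, 80, 1)
ram[9] = (22, 60, 1)                     # ... ends in the next section
second_level = merge_section(ram, section_len=8, window=4)
```

In this toy input, the track on channels 20 to 22 is folded step by step into the entry at address 9, which lies in the next section and will be resolved when that section is merged.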

Considering the presence of many empty addresses in RAM, it is necessary to skip these blank addresses
to improve data processing efficiency. This applies not only to the merging module but also to the coordinate
matching module. To address this, this study adopts a dynamic indexing register to mark whether each address
in RAM contains valid data, enabling the blank addresses to be skipped. Taking the first-level RAM shown in
Fig. 4 as an example, the number of bits in the indexing register is equal to the RAM depth, and each bit
corresponds to one address in RAM. During the process of time sorting, if a specific address in RAM is written
with data, the corresponding bit in the register is set to 1. While traversing the valid data in RAM to merge
signals, the multiple branch selection statement (the casex statement in the Verilog language) is used to quickly
locate the first non-zero bit in the indexing register. This corresponds to the first non-empty storage address in
RAM. Once the merging of the signal at the corresponding address is completed, the corresponding bit in the
indexing register is set to 0. This process is repeated, and once the merging of signals in the entire time slice is
completed, all bits in the indexing register have been cleared to zero.
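
The indexing register behaves like a bitmap combined with a priority encoder. The sketch below models it with a Python integer; the bit-twiddling expression stands in for the casex-based lookup and is not the firmware implementation, and the function names are hypothetical.

```python
def first_set_bit(index_reg: int) -> int:
    """Address of the lowest set bit, or -1 if the register is all zeros."""
    if index_reg == 0:
        return -1
    return (index_reg & -index_reg).bit_length() - 1


def visit_valid_addresses(index_reg: int):
    """Yield non-empty addresses in ascending order, clearing bits as it goes."""
    while index_reg:
        addr = first_set_bit(index_reg)
        yield addr
        index_reg &= ~(1 << addr)   # processed: clear the bit


index_reg = 0
for written_addr in (9, 2, 4):      # the sorter sets one bit per RAM write
    index_reg |= 1 << written_addr
visited = list(visit_valid_addresses(index_reg))
```

Only addresses whose bits are set are ever visited, so the traversal cost follows the number of stored signals rather than the RAM depth.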
Fig. 4 Precise start point merging technique utilizing a circular combined RAM. (a) The circular combined
RAM structure for precisely merging signals at the boundary of two time slices. (b) The successive merging 
process. 


3. Tests and verification 


3.1 Self-adaptive serial link test 


The self-adaptive serial link is the foundation for position reconstruction. Hence, a test of the reliability of 
the link is needed. The performance test of the serial link is carried out from two perspectives: eye diagram and 
bit error rate (BER). 

The eye diagram of the digital signal reflects its overall reliability and quality [26]. The test result, as
shown in Fig. 5, indicates a clear eye diagram with a wide opening range. The eye width measures
approximately 1400 ps, which indicates the excellent quality of the serial link.

A further BER test was carried out to verify whether the method of dynamically adjusting the delay chain 
length can achieve accurate clock sampling of serial data, thus examining the feasibility of the self-adaptive 
serial link. In the BER test, the DAB sends a certain number of pseudo-random number sequences while the 
CCB recovers the pseudo-random numbers and compares them with locally generated pseudo-random numbers 
in the same mode. Equation (1) shows the relationship between the link error rate and confidence level, 
according to the hypothesis testing theory in statistics. In the equation, n and k represent the number of bits in 
the pseudo-random sequence and the number of detected errors, respectively. CL denotes the confidence level 
at which the link error rate is less than p [27]. The BER test process lasted for approximately 3 hours, during 
which a pseudo-random number sequence of 6.75 × 10¹² bits was transmitted with the link rate set to 625 Mbps.
No errors were detected in the test procedure. It can be inferred that the BER is less than 10⁻¹² at a confidence
level of 99%, meeting the reliability standards of conventional high-speed serial links, such as the PCIe bus. 
This shows that despite the inherent lack of synchronization between the local receiver clock and the incoming 
serial data stream, the self-adaptive serial link protocol proposed in this study allows the receiver to dynamically 
adjust the lengths of the programmable delay chains. This automatically fine-tunes the phase relationship 
between the serial data stream and the clock, ultimately enabling the asynchronous receiver clock to perform 
precise sampling of the serial data stream. In summary, the experimental results indicate the capability of the 
self-adaptive serial link to ensure stable and dependable data transmission. 
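
The quoted BER bound can be checked numerically. The sketch below assumes the standard zero-error form of the binomial confidence relation, CL = 1 - (1 - p)^n, which is the k = 0 case of the hypothesis-testing relation referred to as Equation (1); the variable names are illustrative.

```python
import math

link_rate_bps = 625e6     # link rate from the text
n_bits = 6.75e12          # transmitted bits from the text
cl = 0.99                 # confidence level from the text

# Test duration implied by the bit count and the link rate.
test_hours = n_bits / link_rate_bps / 3600

# Solve 1 - (1 - p)**n == cl for p, using log1p/expm1 for numerical accuracy.
ber_upper = -math.expm1(math.log1p(-cl) / n_bits)
```

With these inputs the implied duration is 3 hours and the upper bound is about 6.8 × 10⁻¹³, which agrees with the statement that the BER is below 10⁻¹² at a 99% confidence level.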

The notable advantages of the self-adaptive serial link lie in the simplicity of physical link connections 
and the ability to easily implement link protocols using common FPGA logic resources. Consequently, this 
addresses a constraint encountered in our study, namely, the quantity limitation of backplane buses on the PXIe 
platform. By contrast, conventional synchronous parallel transmission methods necessitate dedicated clock 
signal paths and multiple parallel data links at both the transmitter and receiver ends. Further, traditional high- 
speed serial transmission methods based on gigabit transceivers (GT) consume the scarce GT core resources of 
an FPGA, making them unsuitable for aggregating data across a dozen or more circuit boards. 
Fig. 5 Eye diagram of the self-adaptive serial link.


3.2 Position resolution experiment 


The position resolution experiment was conducted with a standard Am-Be neutron source and a known
semicircular ⁶LiF conversion film affixed to the Micromegas detector. Due to the relatively low intensity of
the Am-Be neutron source, in this particular experiment, a thick (2 μm) pure ⁶LiF conversion film was utilized.
The imaging result of the Am-Be neutron radiation field is shown in Fig. 6 (a). The spatial resolution of the
system can be estimated by fitting the neutron count distribution at the semicircle's diameter boundary. The
image is further rotated by 45°, and neutron counts are accumulated along the diameter boundary of the
semicircle.

The experimental distribution B(x') of the accumulated hits at the semicircle's diameter boundary is the
convolution of the actual particle distribution T(x) and the system's Gaussian response function R(x, x'), as
shown in Equation (2) and Equation (3) [28-29]. Positional resolution tests for micropattern gas detectors
typically employ specific shaping components to adjust the spatial distribution T(x) of particles entering the
detector into the expected distribution. Since neutrons are electrically neutral, the conversion film itself can
serve as the shaping component. Narrow slits or blades are the most commonly used shaping components
because the expected distributions of neutrons passing through them are simple δ and step distributions,
respectively. For simplicity in the experiment, an existing circular conversion film was divided into two halves,
and the diameter of a semicircle could be regarded as a blade. Consequently, the actual distribution of incident
particles at the diameter conforms to a step distribution. Furthermore, the experimental distribution B(x') is
simplified as a cumulative Gaussian distribution, as shown in Equation (4). By fitting the experimental data as
shown in Fig. 6 (b), the standard deviation (σ) of the fit can be obtained, which indicates a spatial resolution of
approximately 1.4 mm. As mentioned in the introduction, conventional measurement methods mainly rely on
point-by-point scanning with active detectors or a spatial arrangement of passive detectors. The accuracy of
these methods is approximately 5 mm to 1 cm [30-33]. In comparison, the method proposed in this study
improves the measurement accuracy to 1.4 mm.


B(x') = \int T(x) \, R(x, x') \, dx    (2)

R(x, x') = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[ -\frac{(x - x')^{2}}{2\sigma^{2}} \right]    (3)

B(x') = \int_{0}^{+\infty} R(x, x') \, dx    (4)
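
The cumulative Gaussian of Equation (4) has a closed form in terms of the error function, which makes the edge fit easy to prototype. The sketch below is illustrative only: the edge position and plateau height are hypothetical extra fit parameters, the profile is synthetic rather than measured, and a coarse grid search stands in for whatever fitting routine was actually used.

```python
import math


def edge_response(x, edge, sigma, height):
    """Step edge at `edge` blurred by a Gaussian of width sigma."""
    return 0.5 * height * (1.0 + math.erf((x - edge) / (math.sqrt(2.0) * sigma)))


def fit_sigma(xs, counts, edge, height, candidates):
    """Least-squares grid search over sigma with edge and height held fixed."""
    def sse(sigma):
        return sum((edge_response(x, edge, sigma, height) - c) ** 2
                   for x, c in zip(xs, counts))
    return min(candidates, key=sse)


# Synthetic, noise-free profile (illustrative values, not measured data).
xs = [0.5 * i for i in range(-20, 21)]                    # positions in mm
counts = [edge_response(x, 0.0, 1.4, 1000.0) for x in xs]
sigma_grid = [0.1 * k for k in range(5, 31)]              # 0.5 .. 3.0 mm
best_sigma = fit_sigma(xs, counts, 0.0, 1000.0, sigma_grid)
```

For real data, the edge position and plateau height would normally be fitted together with σ, and the count uncertainties would enter the least-squares weights.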


The high spatial resolution exhibited in the experimental results mainly arises from two key factors.
First, this study employs a Micromegas micropattern detector with a pitch of only 1.5 mm between the
readout strips. According to the properties of the uniform distribution, it can be inferred that the strip pitch
sets the theoretical limit of the system's spatial resolution at approximately 0.43 mm (standard
deviation). Second, the FPGA-based hardware position reconstruction algorithm proposed in this paper is
founded on the μTPC principle. As discussed in the introduction, the μTPC principle takes the Micromegas
detector as a micro time-projection chamber to enable the reconstruction of the starting point of the particle's
initial ionization drift path through the temporal order of multiple induced signals. Previous research has
indicated that the μTPC principle offers superior position reconstruction accuracy compared with traditional
centroid methods [19, 20].
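
The 0.43 mm limit follows from the standard deviation of a uniform distribution over one strip pitch. As a short worked derivation, with p denoting the strip pitch (a symbol introduced here for illustration):

```latex
\sigma_{\text{pitch}}^{2}
  = \frac{1}{p}\int_{-p/2}^{p/2} x^{2}\,dx
  = \frac{p^{2}}{12},
\qquad
\sigma_{\text{pitch}}
  = \frac{p}{\sqrt{12}}
  = \frac{1.5\ \text{mm}}{3.464}
  \approx 0.43\ \text{mm}.
```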
Fig. 6 Experimental results of position resolution. (a) Imaging result of the Am-Be neutron radiation
field. (b) Fitting result with the convolution function of the step and Gaussian distributions. 


3.3 BNCT beam experiment 


The beam experiment was conducted on a dedicated BNCT reactor neutron source, namely, the in-hospital
neutron irradiator at the China Institute of Atomic Energy. The theoretical flux of the neutron beam is up to 10⁹
n·cm⁻²·s⁻¹ [34]. Considering the high beam flux, an ultra-thin (30 nm) conversion film was employed in the
experiment, and natural LiF, rather than purified ⁶LiF, was adopted as the conversion material to reduce the
conversion efficiency. Fig. 7 (a) displays a photograph of the experimental setup, and Fig. 7 (b) shows the
spatial distribution of thermal neutron components in the beam obtained with the conversion film.

It can be observed that the neutron beam exhibits a circular profile with a diameter of
approximately 12 cm, which conforms to the shape of the beam outlet. The evenly distributed light-colored
grid points result from the mesh pillars of the Micromegas detector. The spacing between the mesh pillars is
approximately 10 mm, and the diameter is approximately 1 mm, which is consistent with the measurement
results. The large-area ⁿᵃᵗLiF film for the thermal neutron detector is composed of four 10 cm × 10 cm square
films. Consequently, the slightly lower neutron counts at the junctions of the ⁿᵃᵗLiF films and the 20 cm × 20 cm
square boundary outside the beam profile are both visible.

During the beam test, the position reconstruction results are obtained immediately after data acquisition is 
completed. Compared to the measurement times of hours for traditional point-by-point scanning with active 
detectors or a spatial arrangement of passive detectors, this study improves the equipment's practicality and 
measurement efficiency. 

Additionally, the spatial distribution of the neutron beam flux in BNCT is concerned with the relative 
distribution rather than the absolute value of the neutron flux. This parameter will serve as a relative coefficient 
input into the future BNCT treatment planning system. Thus, practical numerical values are not presented here. 
If accurate absolute fluence values for specific points can be measured in the future, in conjunction with the 
spatial distribution measurement method demonstrated in this paper for relative fluence, a complete
two-dimensional absolute fluence distribution can be obtained. Moreover, due to the challenge of producing a
large, uniformly ultra-thin (30 nm) ⁿᵃᵗLiF film, preliminary tests were conducted using four 10 cm × 10 cm
square films pieced together. Uniformity calibration of the films was not performed in the preliminary tests,
making it
difficult to obtain comprehensive and precise uniformity values of the beam flux at this stage. The nuclear 
reactor is normally shut down, and the availability of beam time for experiments is limited. Therefore, these 
limitations will be addressed further in future system-level work. 

In summary, although there are some imperfections in the test results, the preliminary beam experiment 
has confirmed the online rapid measurement capability of the FPGA-based particle position reconstruction 
method proposed in this paper. Furthermore, the details of the beam experiment results, along with the results
of the position resolution experiment, indicate the system's excellent spatial resolution.
Fig. 7 (a) Photograph of the neutron beam experiment. (b) Spatial distribution of thermal neutrons
obtained with the ⁿᵃᵗLiF conversion film.


4. Conclusion 


This paper proposes an FPGA-based online position reconstruction method, rooted in the μTPC principle,
that effectively eliminates the need for lengthy offline data processing, thereby enhancing equipment 
practicality and improving measurement efficiency. This study contributes to advancing the application of 
Micromegas detectors in measuring the flux spatial distributions of BNCT neutron beams, consequently 
forming a rapid and precise measurement technique. In response to the difficulty faced by FPGAs in sorting 
and searching operations, this study presents a rapid sorting and searching technique based on timestamp to 
RAM address mapping. In comparison to traditional approaches such as bubble sorting and traversal searching, 
the proposed method simplifies the design of logic firmware and reduces time complexity from O(n²) to O(n).
In addition, given the prevalence of timestamps as common pieces of information in physics experiments, the 
proposed method can provide valuable insights and references for researchers who need to execute timestamp- 
related algorithms within an FPGA. 